Language Processing and Learning Models for Community Question Answering in Arabic
Authors
Abstract
In this paper we focus on the problem of question ranking in community question answering (cQA) forums in Arabic. We address the task with machine learning algorithms using advanced Arabic text representations. The latter are obtained by applying tree kernels to constituency parse trees combined with textual similarities, including word embeddings. Our two main contributions are: (i) an Arabic language processing pipeline based on UIMA, covering every step from segmentation to constituency parsing and built on top of Farasa, a state-of-the-art Arabic language processing toolkit; and (ii) the application of long short-term memory neural networks to identify the best text fragments in questions to be used in our tree-kernel-based ranker. Our thorough experimentation on a recently released cQA dataset shows that the Arabic linguistic processing provided by Farasa produces strong results and that neural networks combined with tree kernels further improve both efficiency and accuracy. Our approach also enables an implicit comparison between different processing pipelines, as our tests on the Farasa and Stanford parsers demonstrate.
Similar resources
Formulation of Language Teachers' Identity in the Situated Learning of Language Teaching Community of Practice
A community of practice may shape and reshape the identity of its members by providing them with situated learning or a learning environment. This study therefore aims to clarify the salient learning-based features of the language teaching community of practice that might formulate the identity of language teachers. To this end, the study examined how learning situations in two ...
Using Generalized Language Model for Question Matching
Question answering is one of the popular services on the World Wide Web. The main goal of these services is to find the best answer to a user's input question as quickly as possible. To achieve this, most of them use new techniques for question matching. There are many question answering services on the Persian web, so it seems that developing a question matching m...
Citizenship Classes for Bhutanese-Nepali Elders: From Cognitive Deficits to Cultural-Historical Understandings
This article focuses on home-based citizenship classes for Bhutanese-Nepali elders in Central Ohio in the United States. As part of a larger longitudinal study centered in the ethnographic, language socialization, and discourse analytic traditions, the article focuses on data, particularly regular audiovideo recordings, gathered over a five-month period and tracks one student’s progress towards...
Presenting a Religious Question and Answer Corpus in the Persian Language
Question answering is a field of natural language processing and information retrieval that has attracted researchers' attention in recent decades. Due to growing interest in this field of research, the need for appropriate data sources has become apparent. Most research on developing question answering corpora has so far been done in English, but in other languages such as Persian, the lack of these co...
Learning Semantic Relatedness in Community Question Answering Using Neural Models
Community Question Answering forums, such as Quora and Stack Overflow, contain millions of questions and answers. Automatically finding the relevant questions among the existing questions and finding the relevant answers to a new question are Natural Language Processing tasks. In this paper, we aim to address these tasks, which we refer to as similar-Question Retrieval and Answer Selection. We pre...